add ambiguous claims to contradictions when penalized is true #2644

Merged
penguine-ip merged 1 commit into confident-ai:main from miafig:fix/faithfulness-penalize-ambiguous-description
May 12, 2026

@miafig miafig commented May 1, 2026

Contributor

Problem

When penalize_ambiguous_claims=True in FaithfulnessMetric, the generated reason is positive even when the score has been penalized for ambiguous claims.
This happens because ambiguous claims are not counted as contradictions in _generate_reason().

Fix

Add all ambiguous claims to the contradictions list when penalize_ambiguous_claims=True.

from deepeval.metrics import FaithfulnessMetric
from deepeval.test_case import LLMTestCase

# The retrieval context is ambiguous: the return window depends on the region.
example = LLMTestCase(
    input="What is the refund policy?",
    actual_output="You can return items within 30 days.",
    retrieval_context=["Return policy 20 days for AMER", "Return policy 30 days for EMEA"],
)

# Current behaviour
metric = FaithfulnessMetric(penalize_ambiguous_claims=True)
metric.measure(example)
>> 0.0
metric.reason
>> Awesome job! There are no contradictions ...

# New behaviour
metric = FaithfulnessMetric(penalize_ambiguous_claims=True)
metric.measure(example)
>> 0.0
metric.reason
>> The score is 0.00 because the actual output incorrectly claims ...
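
For readers who want the idea behind the fix in isolation, here is a minimal sketch of the selection logic. It is not the code from this PR: the `Verdict` class and `collect_contradictions` helper are hypothetical names, and the sketch assumes each claim carries a verdict of "yes", "no", or "idk" (ambiguous) together with a reason string.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Verdict:
    # Hypothetical stand-in for a per-claim verdict: "yes", "no", or "idk".
    verdict: str
    reason: str = ""


def collect_contradictions(
    verdicts: List[Verdict], penalize_ambiguous_claims: bool = False
) -> List[str]:
    """Return the reasons that should be reported as contradictions."""
    # "no" verdicts are always contradictions.
    flagged = {"no"}
    # Assumption: ambiguous claims use the verdict "idk"; they are only
    # reported as contradictions when the penalty is switched on.
    if penalize_ambiguous_claims:
        flagged.add("idk")
    return [v.reason for v in verdicts if v.verdict.strip().lower() in flagged]


verdicts = [
    Verdict("yes"),
    Verdict("idk", "The context lists both 20 and 30 days."),
    Verdict("no", "The context does not mention free shipping."),
]
without_penalty = collect_contradictions(verdicts)
with_penalty = collect_contradictions(verdicts, penalize_ambiguous_claims=True)
```

With the penalty enabled, the ambiguous claim's reason is passed along with the outright contradictions, so a reason generator that only sees this list can no longer describe a penalized score as free of contradictions.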

@vercel

vercel Bot commented May 1, 2026


@miafig is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

@penguine-ip

Contributor

@miafig thank you this is a nice addition, this will make it to the next release :)

@penguine-ip penguine-ip merged commit 14cf30c into confident-ai:main May 12, 2026
8 of 12 checks passed

2 participants